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Abstract 



this article illustrates the use of linear and bilinear random effects models to represent 
statistical dependencies that often characterize dyadic data such as international relations. In 
particular, we show how to estimate models for dyadic data that simultaneously take into 
account: regressor variables and third-order dependencies, such as transitivity, clustering, and 
• balance. We apply this new approach to the relations among ph.d. of university in Iran over 

the period from 1991-2005, illustrating the presence and strength of second and third-order 
statistical dependencies in these data. 



^ 1 Introduction 

CN . Social network data typically consist of a set of n actors and a relational tie yij , measured on 
^ each ordered pair of actors i, j = 1, . . . ,n. This framework has many applications in the social and 

^ • behavioral sciences including, for example, the behavior of epidemics, the interconnectedness of the 

O ' World Wide Web, and telephone calling patterns. 

In the simplest cases, y^j is a dichotomous variable indicating the presence or absence of some 
• • relation of interest, such as friendship, collaboration, transmission of information or disease, and so 
.£h ! forth. The data are often represented by an n x n sociomatrix Y. In the case of binary relations, 
^ | the data can also be thought of as a graph in which the nodes are actors and the edge set is 
c3 {{hi) '■ Ui,j = l}-Social network analysis is a broad area of social science research that has been 
developed to describe the relationships among interdependent units (Holland and Leinhardt 1971, 
Bondy and Murty 1976). It is somewhat surprising that to date there are no published applications 
using a social network framework to study international relations since it is evident at first blush that 
international politics is about the interdependencies that appear around the world. This is perhaps 
due to the fact that most tools for social network analysis are focused on the simple case of binary 
(0-1) relations, where the data can be represented by a simple graph (see Wasserman and Faust 1994, 
Wasserman and Pattison 1996). Dealing with non-binary data (such as counts or continuous data) or 
regressor variables has not been well addressed in the social networks literature (see Hoff, Raftery, and 
Handcock 2002 for a discussion). Herein, we develop a generalized regression framework for analyzing 
and accounting for the dependencies in valued and binary dyadic international relations data. This 
approach builds on the social relations model (Warner, Kenny and Stoto 1979; Wong 1982) that 
specifies random effects for the originator and recipient of a relation or action, as well as allowing for 
within dyad correlation of relations. We expand upon previous approaches by allowing for certain 
kinds of third-order dependence using an inner product of latent, unobserved characteristic vectors. 
The use of inner products to model dependencies is new, and related to the the recent development 
of "latent space" models for dyadic data (Hoff, Raftery and Handcock 2002). 



2 Latent Space Approaches to Social Network Analysis 



In some social network data, the probability of a relational tie between two individuals may increase as 
the characteristics of the individuals become more similar. A subset of individuals in the population 
with a large number of social ties between them may be indicative of a group of individuals who have 
nearby positions in this space of characteristics, or "social space". Various concepts of social space 
have been discussed by McFarland and Brown (1973) and Faust (1988). In the context of this article, 
social space refers to a space of unobserved latent characteristics that represent potential transitive 
tendencies in network relations. A probability measure over these unobserved characteristics induces 
a model in which the presence of a tie between two individuals is dependent on the presence of other 
ties. Relations modeled as such are probabilistically transitive in nature. The observation of % — > j 
and j — > k suggests that i and k are not too far apart in social space, and therefore are more likely 
to have a tie ( Holland and Leinhardt 1971). In latent variable model it is assumed each actor i has 
an unknown position Zj in social space. The ties in the network are assumed to be conditionally 
independent given these positions, and the probability of a specific tie between two individuals is 
modeled as some function of their positions, such as the distance between the two actors in social 
space. Estimation of positions is simplified by the use of a logistic regression model, and confidence 
regions for latent positions are computable using standard MCMC algorithms. 

2.1 Distance Models 

We take a conditional independence approach to modeling by assuming that the presence or absence 
of a tie between two individuals is independent of all other ties in the system, given the unobserved 
positions in social space of the two individuals, 

p(Y|Z,X,0) = \{p(y i j\z h z j ,x i ^,a,(3) 

where X and x it j are observed characteristics which are potentially pair-specific and vector-valued 
and a,f3 and Z are parameters and positions to be estimated. A convenient parameterization is the 
logistic regression model in which the probability of a tie depends on the Euclidean distance between 
Zj and Zj, as well as on observed covariates that Xij measure characteristics of the dyad, 







P(Vi,j 


0|zj, Zj, Xij, ex, 0) 



= a + {3 1 x^j — \zi — Zj\ (1) 

3 Linear Mixed Effects Models for Exchangeable Dyadic 
Data 

Suppose we are only interested in estimating the linear relationships between responses i/ij and a 
possibly vector valued set of variables x^j, which could include characteristics of unit i, characteristics 
of unit j, or characteristics specific to the pair. In this case we might consider the regression model 

Di,j = §_ x ij "I" (2) 

where y^j is typically not defined. It is often assumed in regression problems that the regressors x^j 
contain enough information so that the distribution of the errors is invariant under permutations 



of the unit labels. This assumption is equivalent to the n x n matrix of errors (with an undefined 
diagonal) having a distribution that is invariant under identical row and column permutations, so 
that {eij : i ^ j} is equal in distribution to {^(i),^) : i ^ j} for any permutation ir of {1, • • • , n}. 
This condition is called weak row-and- column exchangeability of an array. For undirected data, such 
exchangeability implies a "random effects" representation of the errors, in that 

£ i,j ~ f(v,ai,ajni,j) (3) 

where /z, a«, a-,-, jij are independent random variables and / is a function to be specified (Aldous 
1985, Theorem 14.11). If in addition to the above invariance assumption we also model the errors as 
Gaussian, then the joint distribution can be represented in terms of a linear random effects model. 
In the more general case of directed observations, we can represent the joint distribution of the errors 
as follows: 



£ i,j = <* + a>j + (4) 

where 

cii, . . . ,a n *~ N(0 ,al) 

with effects otherwise being independent. The covariance structure of the errors (and thus the 
observations) is as follows: 

Etfj) = al + a 2 h + a 2 1 , E( £iJ e itk ) = a\ 

E(e i:j e jti ) = p a* + 2a ah , E(e id e kJ ) = a\ 

E(Eij£k,i) = , E(e it jek ; i) = i7 ah 

To analyze responses in particular sample spaces, the error structure described above can be added 
to a linear predictor in a generalized linear model: 

9 id = ftxij + di + bj + 7iJ - , E(yij\0ij) = g(0i,j) 

This is a generalized linear mixed-effects model with inverse-link function g(&), in which the obser- 
vations are modeled as conditionally independent given the random effects, but are unconditionally 
dependent. 




3.1 Modeling Third Order Dependence Patterns 

Some dependence patterns commonly seen in dydaic datasets have been given the descriptive titles 
of balance and clusterability. for example after fitting a regression model and obtaining the residuals 
'■ i 7^ j}, the theoretic definitions of these concepts are as follows: 

Definition 3.1 For signed residuals, a triad i,j, k is said to be balanced if x £j,fc x £i,k > 

Definition 3.2 Clusterability is a relaxation of the concept of balance. A triad is clusterable if it 
is balanced or the relations are all negative. The idea is that a clusterable triad can be divided into 
groups where the measurements are positive within groups and negative between groups. 



Clusterability and balanced cycle of residuals are shown garaphically in Figure 1. 
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Figure 1 : Balance and Clusterability of Cycles 

Hoff et al. (2002) used simple functions of latent characteristic vectors in a fixed effects setting to 
capture some forms of t balance and clusterability. For example, they considered models in which 
= P'xij + f( z ii z j) where f(zi,Zj) = |zj — Zj\. we consider a similar approach using the inner 
product kernel f(zi, Zj) = z\z^ and give random and fixed effects interpretations. Adding the bilinear 
effect to the linear random effects in models (4) gives 

= Q>% 4~ aj + + , = ZjZj (5) 

to suggest 

Zl ,...,z n l ^ d MVN(0 ,a 2 J KxK ) 
the nonzero second and third order moments are 

£(40 = 2al + 0-* + kat , E{e h] e hk ) = a\ 



E ( £ i,j £ j,i) = a 7 + 2 °a + k<J z > E ( £ i,j £ k,j) = & 



a 



Thus the effect £y = z^Zj can be interpreted as a mean-zero random effect able to induce a particular 
form of third-order dependence often found in dyadic datasets. 



4 Bilinear Mixed Effects Models Parameters Estimation 

To obtain a "cleaner" partition of the variance and a more efficient MCMC sampling scheme, in model 
(2) we decompose x^ into Xy = (xij,x Sti ,x s j) i.e. into dyad specific regressors x it j , sender specific 
regressors x s ,i and receiver specific regressors x s j. The generalized bilinear model is then rewritten 
as 

PdXij ~\~ /3 _^-s,i fij&B,3 £ i,j 

or equivalently 

0i,j = PdXij + Si + Sj + + z'iZj (6) 

S» /3 g X s ^ -|- Oj 

where x S j = (0/5, and /3 = {(3q,(3 s )'. This parameterization for the linear unit-level effects is 
similar to the "centered" parameterizations suggested by Gelfand et al. (1995, 1996). Note that an 
intercept can be thought of as both a sender or receiver specific effect. For symmetry, we include the 
constant 1/2 at the beginning of each x S i and x S) j vector. 

Using the above reparameterization for 9ij, we estimate the parameters for the generalized bilinear 



regression model by constructing a Markov chain in /3 , a^, a%, E 7 , Z} (where Z denotes the kxn 
matrix of latent vectors), having p((3d, ft , cr^, E 7 , Z) as the invariant distribution. This is obtained 
via an algorithm based on Gibbs sampling, which also samples s,r and the 0's. The basic algorithm 
is to iterate the following steps: 

1. Sample linear effeects: 

(a) Sample (3 d , s\fi , a^, a%, S 7 , 0, Z(linear regression); 

(b) Sample (3 \s,o\ (linear regression); 

(c) Sample of and £ 7 from their full conditionals. 

2. Sample bilinear effects: 

(a) For % — 1, . . . , n sample Zj|{zj, j ^ i},0, (3, s, a%, S 7 (a linear regression); 

(b) Sample a\ from its full conditional. 

3. Sample dyad specific parameters: Update (0ij,0j,i) using a Metropolis-Hastings step: 
(a) Propose 



20-MVNff^ + ai + ^ + Z h),S 

9lJ Wp-Xjt + aj + Oi + rfp ! 



(9* \ 
£i J with probability 

. //'(//;, ^;,)/'!//,; (>;,.,) , 

a = mm — ' . — , 1 

\p(vij\eij)P(y s M 

for more detail see Metropolis et al.(1953) and Hastings et al. (1970). Various combinations of 
the above steps can be used to estimate different models. The steps in 1 alone provide a Bayesian 
estimation procedure for the linear regression problem having an error covariance as in (2). Bayesian 
estimation of the normal bilinear model with the identity link could proceed by replacing each 9ij 
with i/i j and only iterating steps 1 and 2. Estimation of a generalized linear mixed effects model with 
random effects structure given by (2) could proceed by iterating steps 1 and 3. The full conditional 
distributions required to perform steps 1 and 2 are given below. 

4.1 Conditional Distributions for the Linear Effects Components 

Similar to Wong's (1982) approach to the invariant normal model, we let 

u iJ — + — 2ZjZj 



We then have 



v iJ = &ij - f or i < 3 

Uij = /3 d (xij + Xj.i) + 2{ Si + Sj) + b Ui] , 5 Uij = 7ij + 7^ 
Vij = 

with definition u = {u^j}, 5 U = {S Uij } and X u the appropriate design matrices: 

Pi 



u = X u (^)+5 u (7) 



and 



u ~ MVN(X U &, al l M ) , a 2 u = 4<r* 



(8) 



where M = and $ = ((3 d s')'. we have 

S ~MVN(X s ^,a 2 a I nxn ) 
and X s = (x Sjl , x s>2 , . . . , x s n )'. The full conditional distribution of model (6) is then 
L(u\p d ,s,a 2 JxL(s\p,al) 



oc exp { - \ [( u - X u $ )'( u - X u d> )/^} 



(9) 



; s 's + /3'x' s xj -2/?'x; s )/^ 



(10) 



joint posterior distributions using approach Bayesian is then proportional to product of prior density 
and function likelihood (gelman 2003): 

t(s,&,/3 ,<7a»0?|u) oc 



L(u\(3 d , s, <r 2 )L(s|/3 , al)7T(p d )n(p)7T(al)n(a. 



7) 



(11) 



note that we assume the parameters is independent. 
•Full conditional of ((3 d , s) 

The full conditional distribution of ((3 d , s) is then proportional to joint posterior density and obtain 
with omitting the terms that uncondition to ((3 d , s). 

n((3 d , s\(3_ s , a\, 4 u) oc L(u\(3 d , s, aJ)L(s|^, <7>(&) 

For a multivariate normal (/i^ d , cr| ) prior distribution on and then with omitting the terms that 
uncondition to (/3d, s): 

7r(/3 d , s|/3 , a* u) oc 



X s/ 3 /a f 2 ) +A " U / (J ' 



(7 



/3d 







c a I nX n 



+ X' U X U / a 2 



The conditional distribution is thus 



P d: sW,<jla> u~MWV(//,£) 



where 



and 



/i = S 



ft 



< 2 + X>/a 2 



] i X' /(r- 

u °a J -nxn 



~1 -1 



•Full conditional of (3 

— s 

The full conditional distribution of (3 s is then proportional to joint posterior density and obtain with 
omitting the terms that uncondition to (3 . using (11) 

7r(£ s |s,<7»oc L(s|^,^)7r(^) 

For a multivariate normal on (3 s as follows (3^ ~ MVN (/^ , ^ an d then with omitting the terms 
that uncondition to (3 . 

£Js,<x>~MWV(^£) 

where 

a = s [s, + x' s s/al] , s = (e£ + x' s x s /^) 1 

•Full conditional of a\ 

The full conditional distribution of a\ is then proportional to joint posterior density and obtain with 
omitting the terms that uncondition to a\. 



a 

7T(<Ta|a) OC L(a|<7a)7T((7a) 

note that a 1: . . . , a n iV(0 , cr^) 

L(a|^) oc \al\-%exp { -If^/^j 

For a inverse gamma distribution on <r a as follows er a ~ /C7(o; al , a a2 ) The full conditional distribution 
of <y\ is then 

1 n 

a a |a ~ JG(a a i + ^™ , «a2 + ^ ^/^a) 

2 i=i 

•Full conditional of a* 

note that 

to find The full conditional distribution of a\ using (11) 

7r(al\(3 d ,s,a^,u) oc L(u|/3 d , s, <fyir(ol) 

For a inverse gamma distribution on cr^ as follows ~ /G(ai u i, a u2 ) and with omitting the terms 
that uncondition to .The full conditional distribution of er„ is then 

ol\(3 d , s, o-J, u ~ IG(a ul + \m , a u2 + ( u - X u $ )'( u - X u $ )) 

4.2 Conditional distributions for the Bilinear Effects Component 

Let e it j = (9 it j + 9j ti — u it j)/2, the residual of the symmetric part of the matrix of 0's after fitting the 
linear effects, and let 5 Uji j = 7^ + 7^. Considering the full conditional of Zj, we have 

e it i = z-zi + 5 u ,i,i/2 
e i)2 = z^z 2 + 5 u ,i,2/2 

6j,n = z i z ra + ^u,i,n/2 (12) 



can write the equations to face matrix: 



(13) 



where errors vector to face {e^j : % ^ j} and Z_j matrix fcx (n — 1) obtain to omit of i column 
Z. for example for % — 1: 



ei,-i 



\ei,n/ 



v<7 



note that Var(5 U)ij j/2) = o"„/4 likelihood function model (13)is then: 

L(ei ,-i\Z-i, Zi, al) oc exp j - i [4( _< - Z'^Z; )'( _< - Z'_^ )/<r„ 

posterior distributions Zj is proportional to product of prior density and function likelihood, to 
assume z* ~ MVN(0, £ z ) 

7r(zi|Z_j, cr„, £ z ) oc L(e { _i|Z_j, Zj, cr^)7r(z i ) 

•Full conditional of z* 

The full conditional distribution of z, is then proportional to joint posterior density and obtain with 
omitting the terms that uncondition to Zj. 



Tr^lZ^cr^Es) oc exp j - - x 4( z^Z^Z'^Z;/^ _ 2z' i Z_ i e i - i /al 



xexp < — 



for other hands 

7r(z i |Z_ i ,o-^,S z ) oc exp jz^4Z_;e; _i/<r„ 

the full conditional of Zj is multivariate normal (//, S) with 



2** 



S z 1 + 4Z_;Z'_>* 



A* = 4SZ_ i e i) _ i /aS , S = (S z 1 + 4Z_ i Z'_>u) 



-i 



4.3 Conditional distributions for the matrix covariance E z 

to assume Z; ~ MWV(0, E z ) 



L( Zl , . . . , z n |S z ) oc |S z |-2exp j - - trE^Z'Z j 

posterior distributions for S z is proportional to product of prior density and function likelihood. 

tt(S z |Z) oc L(zi, . . . , z„|S z )7r(S z ) 

•Full conditional of S z 

The full conditional distribution of S z is then proportional to joint posterior density and obtain with 



omitting the terms that uncondition to S z . to assume prior distributions, inverse wishart for I] z as 
follows S z ~ IW ( £ z0 , v ) we have 



7r(S z |Z) oc |S z | 2 eX p 



5 ^' 



J z0 



Z'Z 



to note that property of inverse wishart, The full conditional distribution of S z is 

S Z |Z~IW(S Z0 + Z'Z , u + n) 

Alternatively, if we restrict S z to be allkxk and use an inverse gamma for <r z as follows <7 Z 
IG(a , ai) and with omitting the terms that uncondition to a\ 



7r(S,|Z) oc a, 



2 -(a +nfe/2+l) 



exp {- [c^ + trZ'Z/2]/^ 2 } 



then 



<r 2 |Z ~ IG(a + nk/2 , ai + trZ'Z/2) 



4.4 Selecting the Latent Dimension 

One issue in model fitting is the selection of the dimension k of the latent variables z. Selection 
of K could depend on the goal of the analysis. For example, if the goal is descriptive, i.e. the 
desired end result is a decomposition of the variance into interpretable components, then a choice of 
K = 1, 2 or 3 would allow for a simple graphical presentation of a multiplicative component of the 
variance. Alternatively, one could examine model fit as a function of K based on the log-likelihood, 
having obtaind posterior estimates = a, b, Z, S 7 } for a range of K, one can compare the 

value of logp(Y|^( fc )) to assess model fit versus complexity. Az a funtion of K, the Akaike information 
criterion(AIC) and Bayesian information criterion (BIC) are 



AIC(k) = -2\ogp(Y\¥ k) ) + c+ [2n] x k 



and 



BIC(k) = -2\ogp(Y\¥ k) ) +c + [nlogQ 



x k 



where the suggestion is to prefer the model with a lowest value of the criterion, for hierarchical model, 
Spiegelhalter, Best, Carlin, and van der Linde (2002) suggested usisng the deviance information 
criterion (DIC), 

DIC(k) = -2 logp(Y|^ (fe) ) + 2 x pg } 
where the penalty p$ on the model complexity is given by 

pg) = -2 x [E[\ogp(Y\¥ k) )\Y] - logp(Y|^ (fe) )} 

(k) 

this expection can be approximated by averaging over MCMC samples, the penalty term p D has 
been referred to as the "effective number of parameters" because it has this interpretation in normal 
linear model. 



5 Data Analysis: Relations Between Universities 



for fitting bilinear mixed effect model, we analyze data on relations between 30 university in Iran. We 
take our response yij to be the total number of "positive" actions reportedly initiated by university i 
with target j from 1991 to 2005. Positive actions here include articles in connection statistics sciences 
that they have been published in Iranian Statistical Conference book. Xij is the geographic distance 
between university % and j and Xi is log population(number of master in university). The occurrence 
of a action between any two given countries in these data is rare, with 86of the nondiagonal entries of 
the sociomatrix being equal to 0. some descriptive ploys of the raw data are given in Figure 2. Panel 
(a) plots the response on a log scale versus the geographic distance in thousands of kilometer between 
university % and j. More precisely, this distance is the minimum distance between two university, 
which is if % and j in one city. On average, the number of action decreases as geographic distance in- 
creases. Panel (b) plots log(l+X^.j^i Ui,j), versus log population , wich suggests a positive relationship 
between response and population. The quantities J2j:j^iUi,j is typically called the outdegree of unit i. 
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Figure 2 : Relationships Between (a) Response and Geographic Distance, and 



{b)Outdegree and Population 
5.1 Evidence of Third-Order Dependence 

Before fitting a somewhat complicated bilinear Poisson regression model, we evaluate the necessity of 
such an effort by looking for evidence of balance and clusterability in the data, we do this by fitting 
a simple linear regression on the logtransformed data and examining the residuals for third-order 
dependencies of the types described, more specifically, we obtain ordinary least squares estimates 
for the regression model 

\og{yi,i + 1) = A) + PdXij + ai + a,j + £ij 
There are several indications of third-order dependence in these residuals: 

1. Because the mean of the residuals is 0, independence of the residuals implies that the average 
value of the product £ij£j,k£k,i over triads also should be ( the concept of independence of the 
residuals is E(£ij£j t k£k,i) — 0). As discussed in section 3.1, a value larger than would indicate 
some degree of balance. The empirical average over triads turns out to be 0.0035. 



2. The fraction of residuals that are positive is p = 0.45(the distribution of residuals is not sym- 
metric). Under independence, the proportion of cycles that we would expect in the two balanced 



categories shown in Figure 1 (+++ and H — ) are p 3 = 0.091 and 3p(l — p) 2 = 0.4, whereas 
the observed proportion are 0.115 and 0.385. The observed proportion in the unclusterable 
category (++-) is 0.333 and the value expected under independence is 3p 2 (l — p) — 0.334. 
The expected proportion in the clusterable but unbalance category is 0.166, and the observed 
proportion is 0.167. 

3. As described in section 3.1, in a balance system we expect that if £jj > 0, then £j >k and ^ 
will have the same sign, such a pattern is shown graphically in Figure 3, which for each pair 
{i,j} plots £jj versus the proportion of other nodes k for which ^ x > 0. Although the 
distribution of residuals is far from normal, we do see some indication of this type of third-order 
dependence. As we would expect from a balanced system, pairs {i, j} for which £jj is less than 
generally have dissimilar residuals to other, -P(^,fc x ^ > 0) tends to be 0.47, pair {i,j} for 
which £jj is greater than generally have similar residuals to other, P(£,i t k x £j,k > 0)tends to 
be greater than 0.52. 

in the next section we analysis results of fitting bilinear mixed model. 
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Figure 3 : Balanced Residuals 



5.2 Model Selection 

We fit the bilinear mixed effects model to the data using a Poisson distribution and the log-link, so 
that each response y^j is assumed to have come from a Poisson distribution with mean exp(9i t j) , 
and that the y's are conditionally independent given the £'s. assume following model: 

= /3d%i,j T" /^ s x s,i T" /^ s x s,jf T" £i,j 

where x Sj j = (0/5, Xi)' and (3 g = (/3 ,f3 s )'. Table 1 includes are the the marginal probability criteria, 
the DIC penaltypo, in the third column, the AIC criterion , the BIC criterion and the DIC crite- 
rion. In terms of the marginal likelihood criterion, the biggest improvements in fit are in going from 
K — 1 to K — 2 and from K = 2 to K = 3. Using the AIC criterion and penalizing the improvement 
in likelihood by the number of additional parameters, we would choose K — 0. The BIC, with a 
higher penalty on the number of parameters, favors K — 0. In contrast, the DIC favors K — 1. 



Note that the increase in the DIC penalty tends to decrease the DIC. But note that for models whit 
K — 1 and K = 2 the DIC criterion have like predictive ability, then based on these results and our 
ability to plot in two dimensions, we choose to present the analysis of the K — 2 model in more detail. 



K ' logp(Y|/?,s,Z,^) P D AIC{K) BIC{K) DIC{K) 



- 167.30 - 6.00 334.6 334.6 322.62 

1 - 178.82 - 29.28 417.6 436.79 299.08 

2 - 169.30 - 9.49 458.6 496.9 319.80 

3 - 152.96 21.66 485.8 543.3 349.24 

4 - 157.03 12.62 554.0 630.6 339.30 



Table 1 : Evaluation of K 

One Markov chains of length 100,000 was constructed using the algorithm described in section 4. 
The second chain used starting values obtained from the following procedure: 

• fitting generalized linear model, using geographic distance as a regressor and sender and receiver 
labels as factor variables. 

log(yjj + 1) = p d Xij + ai + aj + &j 

we let parameters of prior distribution for (3d to case var{(3d) = <j\ and f/,p d = (3d 

• letting §i = (ai + %)/2 and fitting ordinary regression model we have 

&i — (3 x Si j -|- a, L 
we let = coviPj,^ = £. and (a al , a a2 ) = (2, &l) 

• The iteratively reweighted least-squares fitting procedure produces a matrix R of working residuals, 
with the of diagonal elements undefined. An estimate Z of Z was then obtained by approximating R 
with a matrix product of the form Z'Z. This can be done with an iterative least-squares procedure, 
similar to the Gibbs sampling procedure outlined in Section 4: see ten Berge and Kiers (1989) for 
more details on this problem. 

• An estimate of S 7 is then obtained from E = R — Z'Z. we define <t„ = var(E + E') then 
update from cr„. 
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Figure 4 : Marginal MCMC output for parameters (3d and (3 

Samples of parameter values were saved from the Markov chains every 100 iterations, and for 
example for (3d and (3q are plotted in Figures 4. The chain appear to have achieved stationarity after 
about 10,000 iterations, and so we base our inference on the saved samples after this point. 
Posterior means and standard deviations of the model parameters, based on the saved MCMC samples 
are given in Table 2. 



As in the raw data, we see a negative relation between response and geographic distance 
i?[/3d|Y] = —2.38, and a positive relation between response and log number of master in university 
(E[/3 S \Y) = 0.65). 
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Table 2 : Posterior means and standard deviations for k = 2 

Next, we analyze the posterior distribution of the the k x n matrix of latent vectors Z. In Figure 
5 has ploted sample z's over the plot of the means. Generally, two university will be modeled as 
having z in the same direction if they have large responses to one another relative to their total 
number of actions and covariate values, and/or if their responses involving other university are sim- 
ilar. For example, 3 university Shiraz, Olum pezeshky Shiraz and Azad Shiraz have placed in the 
same direction and so these university are similar because they had minimum 3 partner article, also 
Olum pezeshky Shiraz and Azad Shiraz did not have connection with other university exept Shiraz. 
Mashhad university had contacted at least with 12 university, thus it had reposed in central of plot, 
university Tarbiat Modares in addition to connections with other university, it had 11 partner article 
with Azad Oloum Tahghighat so this two university have placed in the same direction. 
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Figure 5 : Posterior mean of Z 
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